NSF PAR Search | NSF Public Access Repository

CARAT CAKE: replacing paging via compiler/kernel cooperation

https://doi.org/10.1145/3503222.3507771

Suchy, Brian; Ghosh, Souradip; Kersnar, Drew; Chai, Siyuan; Huang, Zhen; Nelson, Aaron; Cuevas, Michael; Bernat, Alex; Chaudhary, Gaurav; Hardavellas, Nikos; et al. (February 2022, Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems)

Virtual memory, specifically paging, is undergoing significant innovation as modern workloads place new demands on it. Recent work has demonstrated an alternative software-only design that can simplify hardware requirements, even supporting purely physical addressing. While we have made the case for this Compiler- And Runtime-based Address Translation (CARAT) concept, its evaluation was based on a user-level prototype. We now report on incorporating CARAT into a kernel, forming Compiler- And Runtime-based Address Translation for CollAborative Kernel Environments (CARAT CAKE). In our implementation, a Linux-compatible x64 process abstraction can be based either on CARAT CAKE or on a sophisticated paging implementation. Implementing CARAT CAKE involves kernel changes and compiler optimizations/transformations that must work on all code in the system, including kernel code. We evaluate CARAT CAKE in comparison with paging and find that it achieves the functionality of paging (protection, mapping, and movement properties) with minimal overhead. In turn, CARAT CAKE enables significant new benefits for systems, including energy savings, larger L1 caches, and arbitrary-granularity memory management.
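The abstract does not spell out how protection is enforced without paging hardware. As a rough illustration only, the following Python sketch models one plausible shape of the idea: the compiler inserts a guard call before each memory access, and the guard consults a table of permitted regions maintained by the runtime or kernel. The names `RegionTable`, `guard`, and `GuardViolation`, and the flat non-overlapping region layout, are assumptions made for this sketch and are not taken from the paper.

```python
import bisect


class GuardViolation(Exception):
    """Raised when a guarded access falls outside any permitted region."""


class RegionTable:
    """Hypothetical table of permitted (non-overlapping) address regions."""

    def __init__(self):
        self._starts = []   # sorted region start addresses
        self._regions = []  # parallel list of (start, length, writable)

    def add(self, start, length, writable):
        # Keep both lists sorted by start address so lookups can bisect.
        i = bisect.bisect_left(self._starts, start)
        self._starts.insert(i, start)
        self._regions.insert(i, (start, length, writable))

    def guard(self, addr, size, is_write):
        """The check an instrumenting compiler might place before an access."""
        # Find the last region whose start is <= addr.
        i = bisect.bisect_right(self._starts, addr) - 1
        if i < 0:
            raise GuardViolation(f"no region contains {addr:#x}")
        start, length, writable = self._regions[i]
        if addr + size > start + length:
            raise GuardViolation(f"access at {addr:#x} overruns its region")
        if is_write and not writable:
            raise GuardViolation(f"write to read-only region at {addr:#x}")


# Example setup: one writable region and one read-only region.
table = RegionTable()
table.add(0x1000, 0x100, writable=True)
table.add(0x2000, 0x10, writable=False)

table.guard(0x1000, 8, is_write=True)   # permitted, returns silently
```

In this toy model every access pays for a table lookup; a real compiler-based design would presumably rely on static analysis to remove or hoist most such checks, which is one reason the abstract stresses compiler optimizations.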
Compiler-Based Timing For Extremely Fine-Grain Preemptive Parallelism

https://doi.org/10.1109/SC41405.2020.00057

Ghosh, Souradip; Cuevas, Michael; Campanoni, Simone; Dinda, Peter (November 2020, Proceedings of the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC 2020))
In current operating system kernels and run-time systems, timing is based on hardware timer interrupts, introducing inherent overheads that limit granularity. For example, the scheduling quantum of preemptive threads is limited, resulting in this abstraction being restricted to coarse-grain parallelism. Compiler-based timing replaces interrupts from the hardware timer with callbacks from compiler-injected code. We describe a system that achieves low-overhead timing using whole-program compiler transformations and optimizations combined with kernel and run-time support. A key novelty is new static analyses that achieve predictable, periodic run-time behavior from the transformed code, regardless of control-flow path. We transform the code of a kernel and run-time system to use compiler-based timing and leverage the resulting fine-grain timing to extend an implementation of fibers (cooperatively scheduled threads), attaining what is effectively preemptive scheduling. The result combines the fine granularity of the cooperative fiber model with the ease of programming of the preemptive thread model.
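To make the idea of "callbacks from compiler-injected code" concrete, here is a minimal Python simulation, offered as a sketch under stated assumptions rather than the authors' implementation. A hypothetical `TimingFramework` stands in for the run-time support, a virtual cycle counter replaces a hardware counter, and the `time_hook()` call inside the loop represents a check that a compiler would inject. All names, the fixed per-iteration cost, and the quantum value are invented for illustration.

```python
class TimingFramework:
    """Toy stand-in for the run-time side of compiler-based timing."""

    def __init__(self, quantum, callback):
        self.quantum = quantum    # virtual cycles between callbacks (assumed unit)
        self.callback = callback  # e.g. a fiber-yield routine in a real system
        self.now = 0              # virtual cycle counter
        self.last_fire = 0        # time of the most recent callback

    def advance(self, cycles):
        # Models the passage of time while ordinary code executes.
        self.now += cycles

    def time_hook(self):
        # The check a compiler would inject: fire the callback once a
        # quantum has elapsed, with no interrupt involved.
        if self.now - self.last_fire >= self.quantum:
            self.last_fire = self.now
            self.callback(self.now)


def instrumented_sum(values, fw, cost_per_iter=10):
    """A loop as it might look after instrumentation (hypothetical)."""
    total = 0
    for v in values:
        total += v
        fw.advance(cost_per_iter)  # simulated cost of the loop body
        fw.time_hook()             # injected timing check
    return total


fired = []  # records the virtual time of each callback
fw = TimingFramework(quantum=100, callback=fired.append)
result = instrumented_sum(range(50), fw)
```

With these parameters the loop consumes 500 virtual cycles, so the callback fires five times, at 100, 200, 300, 400, and 500. The simulation places a check in every iteration; the static analyses mentioned in the abstract are what would let a compiler place far fewer checks while keeping the callback period predictable on every control-flow path.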
Paths to OpenMP in the kernel

https://doi.org/10.1145/3458817.3476183

Ma, Jiacheng; Wang, Wenyi; Nelson, Aaron; Cuevas, Michael; Homerding, Brian; Liu, Conghao; Huang, Zhen; Campanoni, Simone; Hale, Kyle; Dinda, Peter (November 2021, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '21))

OpenMP implementations make increasing demands on the kernel. We take the next step and consider bringing OpenMP into the kernel. Our vision is that the entire OpenMP application, run-time system, and a kernel framework are interwoven to become the kernel, allowing the OpenMP implementation to take full advantage of the hardware in a custom manner. We compare and contrast three approaches to achieving this goal. The first, runtime in kernel (RTK), ports the OpenMP runtime to the kernel, allowing any kernel code to use OpenMP pragmas. The second, process in kernel (PIK), adds a specialized process abstraction for running user-level OpenMP code within the kernel. The third, custom compilation for kernel (CCK), compiles OpenMP into a form that leverages the kernel framework without any intermediaries. We describe the design and implementation of these approaches, and evaluate them using NAS and other benchmarks.